ABSTRACT

The current state of usage of regression models in
analysis of variance (ANOVA) designs is empirically examined, and
examples of several statistical errors made in usage are presented.
The assumptions of the general linear model are that all predictors
are known without error of measurement and are fixed with no
replication or sample variation; in the population, errors are
normally and independently distributed with variance $\sigma_e^2$, and
errors are independent of all predictors. The rules of construction of
the ANOVA allow the expected mean squares to hold just as if the levels
of each factor had been randomly sampled. Analysis of Covariance
(ANCOVA) combines the elements of regression analysis with design,
albeit in a restricted manner. The homogeneity of regression
coefficients is a parameter restriction from the design viewpoint.
The regression weight associated with a given covariate level is
discussed. Most regression approaches to ANOVA and ANCOVA assume a
fixed factor model under all design specifications. A major oversight
of expected mean squares has contributed to the current lack of
concern for the level of generalizability warranted from the design
specification. Aptitude treatment interaction models are examined.
(Author/CM)
Misuses of Regression Approaches to ANOVA and ANCOVA

The instruction of several generations of education graduate students
in research design statistics was based on Lindquist (1953) and his
logical succession: Winer (1962, 1971), Kirk (1968), Glass and Stanley
(1970), and others. All stressed analysis of variance (ANOVA) using
Fisherian partition of sums of squares. There was very little emphasis
on regression approaches until the appearance of Ward and Jennings'
(1973) and Kerlinger and Pedhazur's (1973) texts, which use regression
models exclusively. These texts have apparently promoted increased use
of regression models in ANOVA situations in the last several years.
Willson (1980) reviewed ten years' research in the American Educational
Research Journal, from 1969 to 1978, and found little use of regression
approaches. Since 1978, however, the technique has been extensively
used, as will be reported here. It is the purpose of this paper to
examine empirically the current state of usage of regression models in
ANOVA designs and to list by example several statistical errors made in
this usage.
Regression Assumptions. It is worth reviewing the assumptions of the
general linear model, for these assumptions will be referenced in light
of current practice. Darlington (1968) has summarized the assumptions as
follows:

1. All predictors $X_i$ are known without error of measurement;

2. All predictors $X_i$ are fixed with no replication or sample
variation;

3. In the population, errors are normally distributed independently
with variance $\sigma_e^2$;

4. Errors are independent of all $X_i$.
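The four assumptions can be collected into a single model statement. The following is only a sketch: the symbols $y_j$, $\beta_0$, $\beta_i$ and the observation index $j$ are notation assumed here for illustration and are not taken from Darlington (1968).

```latex
y_j = \beta_0 + \sum_{i} \beta_i X_{ij} + e_j, \qquad
e_j \sim \text{i.i.d. } N(0, \sigma_e^2), \qquad
e_j \text{ independent of } X_{ij} \text{ for all } i
```

Here each $X_{ij}$ is treated as a fixed constant measured without error, so that $e_j$ is the only random term on the right-hand side.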

Design Assumptions. Cornfield and Tukey (1956) and Millman and Glass
(1967) presented the rules for construction of the ANOVA table for the
design in which a levels of Factor A are drawn with equal probability
from $N_A$ possible levels, b levels of Factor B from $N_B$ possible
levels, c from $N_C$, and so forth for each factor in the design. Random
factors are defined for any q < $N_q$, and fixed factors defined for
r = $N_r$. In each cell abc ... q of the design n elements are drawn at
random from a possible N in the population of elements. This is the urn
sampling model of Cornfield and Tukey (1956, p. 917). The expected mean
squares hold for this model just as if the levels of each factor had been
randomly sampled.
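The fixed/random distinction in the urn model can be sketched in a few lines of Python. This is an illustration only: the function names, the teacher and method labels, and the population sizes are hypothetical and do not come from Cornfield and Tukey.

```python
import random


def classify_factor(levels_drawn: int, levels_possible: int) -> str:
    """Fixed if every possible level is in the design, random otherwise."""
    if not 0 < levels_drawn <= levels_possible:
        raise ValueError("need 0 < levels_drawn <= levels_possible")
    return "fixed" if levels_drawn == levels_possible else "random"


def draw_levels(possible_levels, levels_drawn, rng):
    """Draw levels from the urn without replacement, equal probability each."""
    return rng.sample(list(possible_levels), levels_drawn)


rng = random.Random(0)  # seeded only so the sketch is repeatable
teachers = [f"teacher_{i}" for i in range(40)]   # hypothetical population of levels
methods = ["lecture", "discussion", "tutorial"]  # hypothetical exhaustive set

sampled_teachers = draw_levels(teachers, 5, rng)
used_methods = draw_levels(methods, 3, rng)

teacher_kind = classify_factor(len(sampled_teachers), len(teachers))
method_kind = classify_factor(len(used_methods), len(methods))
print(teacher_kind)  # random
print(method_kind)   # fixed
```

Under this reading, a factor such as teachers is random whenever the design contains fewer levels than the population to which the results are generalized.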

ANCOVA Assumptions. Analysis of covariance combines the elements of
regression analysis with design, albeit in a restrictive manner. From
a regression point of view the covariate must be known without error,
as must the treatment level (these are coded with 1, 0, or -1 in
so-called dummy coding). The so-called assumption of homogeneity of
regression coefficients is really just a parameter restriction from the
design point of view. There could be a different covariate effect at
each level of a covariate. The covariate can be considered either fixed
(drug dosage maintenance levels in 100 mg increments from 0 to 1000) or
random (drug levels in 10 mg increments from 0 to 1000, randomly sampled
in stratified 100 mg groups). The regression weight associated with
a given covariate level could be different for each level; more commonly
it is assumed to be identical for all covariate levels, hence the
homogeneity of coefficients, which is merely a restrictive form of a
general linear model (Ward & Jennings, 1973).

The addition of a so-called covariate-treatment interaction term can
be thought of in design terms as addition of an interaction with
(T-1) x (C-1) degrees of freedom, where T is the number of treatments
and C the number of covariate levels. This term is reduced in most
analyses to T-1 degrees of freedom, one parameter per group. Each group
has a different regression slope. This model is most commonly
encountered in educational research in the aptitude-treatment
interaction models of Cronbach and his co-workers (Cronbach & Snow, 1977).
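The T-1 degree-of-freedom test of slope homogeneity can be sketched as a comparison of two residual sums of squares: one from a model with a separate slope in each group and one from a model with a single pooled within-group slope. The code below is a minimal illustration for one covariate. The function names and the small data set are hypothetical, and the computation is the standard textbook form, not a procedure taken from any of the studies reviewed.

```python
def _sums(x, y):
    """Return the within-group corrected sums Sxx, Sxy, Syy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    return sxx, sxy, syy


def slope_homogeneity_f(groups):
    """groups: list of (x, y) pairs, one pair of lists per treatment group.

    Returns (F, df_numerator, df_denominator) for the test that all
    within-group regression slopes are equal.
    """
    t = len(groups)
    n_total = sum(len(x) for x, _ in groups)
    parts = [_sums(x, y) for x, y in groups]
    # Full model: a separate slope (and intercept) for each group.
    ss_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in parts)
    # Restricted model: one pooled within-group slope.
    sxx_w = sum(p[0] for p in parts)
    sxy_w = sum(p[1] for p in parts)
    syy_w = sum(p[2] for p in parts)
    ss_restricted = syy_w - sxy_w ** 2 / sxx_w
    df_num, df_den = t - 1, n_total - 2 * t
    f = ((ss_restricted - ss_full) / df_num) / (ss_full / df_den)
    return f, df_num, df_den


# Hypothetical aptitude (x) and outcome (y) scores for two treatments.
x = [1, 2, 3, 4, 5]
steep = [2.1, 3.9, 6.2, 7.8, 10.0]   # slope near 2
shallow = [1.0, 1.6, 1.9, 2.6, 3.0]  # slope near 0.5
shifted = [v + 5 for v in steep]     # same slope as `steep`, higher intercept

f_diff, df1, df2 = slope_homogeneity_f([(x, steep), (x, shallow)])
f_same, _, _ = slope_homogeneity_f([(x, steep), (x, shifted)])
print(df1, df2)         # 1 6
print(f_diff > f_same)  # True
```

With two groups the numerator has T-1 = 1 degree of freedom. When the two groups differ only in intercept the statistic is essentially zero, and when the slopes differ it becomes large.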
Expected Mean Squares for ANCOVA. The approach to ANCOVA taken in most
texts (Winer, 1971; Kirk, 1968) is to treat covariates as random factors
whose variances are removed from sums of squares for the usual ANOVA.
In effect the expected mean squares for ANOVA are conditional on the
covariates and the expectations are so written (Winer, 1971, p. 770).
What is quite clear is that the residual mean square after fit of the
full model is not the appropriate error term for the covariate or
covariates under the usual model. Those who use regression theory have
rather casually used the difference between mean squares with and without
covariate divided by mean square residual as the test of covariates'
significance. Table 1 shows the mean square expectation table for a
single covariate-single factor design under usual assumptions, most
important of which is that treatment effects are conditional on the
covariate adjustment.



Insert Table 1 About Here 



Unreliability of Regression Variables. It was noted earlier that
regression analysis models assume the predictors to be known without
error. When error of measurement is present the assumption is violated.
Glass, Peckham and Sanders (1972) have reviewed research on this
violation for ANCOVA models. Rogosa (1977) has examined the effects of
unreliability on interaction terms in ANCOVA models for confidence
interval estimation using the Johnson-Neyman technique.

Glass et al. (1972) concluded that unreliability does affect results
of ANCOVA in unpredictable ways. They recommend procedures to adjust
the F-statistic.

Rogosa (1977) reviewed the literature on unreliable predictors
(covariates) for within-group regression (non-homogeneous covariate
slopes). Power is reduced and the Type I error rate may be changed.
The situation is even more complex for two predictors (covariates), and
distortion of error rates can be even more severe.
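One simple way to see why an unreliable covariate matters is the classical measurement-error result that the estimated slope shrinks toward zero in proportion to the covariate's reliability. The simulation below is a sketch under that classical model (true score plus independent error), which is an assumption of this illustration and not the specific analysis of Glass et al. (1972) or Rogosa (1977). All names and parameter values are hypothetical.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_slope(reliability, true_slope=2.0, n=20000, seed=1):
    """Estimate the slope of y on a covariate observed with error.

    True scores have variance 1, so the error variance implied by a given
    reliability is (1 - reliability) / reliability.
    """
    rng = random.Random(seed)
    error_sd = ((1 - reliability) / reliability) ** 0.5
    true_x = [rng.gauss(0, 1) for _ in range(n)]
    observed_x = [t + rng.gauss(0, error_sd) for t in true_x]
    y = [true_slope * t + rng.gauss(0, 1) for t in true_x]
    return ols_slope(observed_x, y)


slope_reliable = simulate_slope(reliability=1.0)    # close to the true slope, 2.0
slope_unreliable = simulate_slope(reliability=0.5)  # close to 2.0 * 0.5 = 1.0
print(round(slope_reliable, 2), round(slope_unreliable, 2))
```

In an ANCOVA the adjustment of treatment means depends on this slope, so a covariate with reliability well below 1 leaves part of the pre-existing group difference unadjusted.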

Stochastic Regression/Covariate Variables. Glass et al. (1972) indicate
there is no serious difficulty when a covariate or predictor is not fixed
but random. Rogosa (1977) found no effect for ANCOVA models
with within-group, nonhomogeneous regression slopes.
Method

All data-based studies from three educational research journals
were surveyed for years 1979 to 1981. The journals were American
Educational Research Journal, Journal of Educational Psychology, and
Educational Evaluation and Policy Analysis. Studies using ANOVA-type
designs with regression-type statistical analyses were examined.
Expected mean square tables were constructed where possible for each
design using the procedures of Millman and Glass (1967). A comparison
was made with empirical results reported for each table. Assumptions of
the ANOVA and ANCOVA model used were compared with actual practice and
discrepancies noted. Of special interest were model misspecification,
nonhomogeneity of regression slopes for covariates, and unreliable
covariates. Model misspecification is defined as incorrect specification
of a factor or predictor as fixed (or random) when common practice or
the author's later generalization clearly points to the opposite
specification. Nonhomogeneity of regression slopes refers to the
possibility of covariate-treatment interaction which was never tested.
Unreliability of covariates refers to the presence of unreliable
covariates, typically intelligence, achievement, or socioeconomic
measures. Differential reliability may exist across treatment groups,
which was not examined.

Results 

From all articles published between 1979 and 1981 there were 29
in AERJ and JEP (none in EEPA) that used a regression approach to ANOVA
or ANCOVA. Six of these were straightforward fixed factor ANOVAs that
met all usual ANOVA assumptions. Of the remaining 23, eleven either
treated factors as fixed that are usually treated as random, or treated
a factor as fixed and then generalized about the population from which
it was drawn. Factors thus treated included teachers, classrooms,
students, and school buildings.

Eight of the studies made no tests of homogeneity of regression
slopes in ANCOVA models. Since many of the remaining studies were
aptitude-treatment studies in which this interaction test was the major
thrust of the study, the failure to test was significant in the remainder.
It should be noted that few studies using straight ANCOVA tested for
homogeneity either.



Insert Table 2 About Here
Of the twenty-three studies eighteen had covariates known to be
unreliable. Fourteen made no attempt to test for unequal reliabilities
across groups, while four made extensive use of generalizability theory
to explore facet generalizability.

In all cases where mixed models should have been used (but one) the
residual was used as the error term for all F-tests. In some cases it
was possible to reconstruct expected mean square tables under the
appropriate mixed model. It was apparent that numerous F statistics would
change from significance to nonsignificance or vice versa, changing
interpretations in some instances. Even this first-step reanalysis did
not pursue hierarchical pooling procedures or construct quasi-F's under
all models. It is possible to say that some studies need reinterpretation.
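The effect of using the residual as the error term can be seen in a small balanced design with a fixed treatment factor and classrooms as a random factor nested within treatments. In the standard analysis of such a design the treatment mean square is tested against the classrooms-within-treatments mean square, not the residual. The sketch below computes both ratios. The function name and the scores are hypothetical and are not a reanalysis of any study reviewed here.

```python
def nested_anova_f(data):
    """data: {treatment: [[scores in classroom 1], [scores in classroom 2], ...]}

    Assumes a balanced design: the same number of classrooms per treatment
    and the same number of students per classroom.
    Returns the treatment F ratio under each choice of error term.
    """
    treatments = list(data.values())
    t = len(treatments)         # number of treatments (fixed factor)
    c = len(treatments[0])      # classrooms per treatment (random, nested)
    n = len(treatments[0][0])   # students per classroom
    all_scores = [s for rooms in treatments for room in rooms for s in room]
    grand = sum(all_scores) / len(all_scores)

    ss_t = ss_c = ss_w = 0.0
    for rooms in treatments:
        t_scores = [s for room in rooms for s in room]
        t_mean = sum(t_scores) / len(t_scores)
        ss_t += c * n * (t_mean - grand) ** 2
        for room in rooms:
            r_mean = sum(room) / n
            ss_c += n * (r_mean - t_mean) ** 2
            ss_w += sum((s - r_mean) ** 2 for s in room)

    ms_t = ss_t / (t - 1)
    ms_c = ss_c / (t * (c - 1))      # classrooms within treatments
    ms_w = ss_w / (t * c * (n - 1))  # residual (students within classrooms)
    return {"F_residual": ms_t / ms_w, "F_classrooms": ms_t / ms_c}


# Hypothetical scores: two treatments, two classrooms each, two students each.
scores = {
    "treatment_1": [[10, 12], [14, 16]],
    "treatment_2": [[15, 17], [21, 23]],
}
result = nested_anova_f(scores)
print(result["F_residual"])    # 36.0, on 1 and 4 degrees of freedom
print(result["F_classrooms"])  # about 2.77, on 1 and 2 degrees of freedom
```

In this example the treatment effect looks very large against the residual but is unremarkable against classroom-to-classroom variation, which is the kind of change from significance to nonsignificance described above.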
Discussion

Most regression approaches to ANOVA and ANCOVA published in two
major education journals in the last three years assume a fixed factor
model under all design specifications. This procedure seems unnecessarily
thoughtless, although the ease of computation using a computer package
such as SAS with its PROC GLM may have contributed to it. When one
reviews the most commonly used regression-approach design texts (Kerlinger
& Pedhazur, 1973; Ward & Jennings, 1973; Cohen & Cohen, 1975) there is no
mention of expected mean squares in them. This major oversight clearly
has contributed to the current lack of concern for the level of
generalizability warranted from the design specification of a study.
Cohen and Cohen (1975) do attempt to place all designs as fixed, however.

It is interesting that so many of the studies involved
aptitude-treatment interaction (ATI). ATI models always involve
covariates (the aptitudes) and factors (the treatments). The interactions
form the major tests of interest, and yet in not one study, with the
exception of Martin, Veldman, and Anderson (1980), was a mixed model
specifically examined. Given the random factors that accompany these
studies, the ATI's expected mean squares surely involve more complex
interactions of the random factors with the aptitude covariate. This
aspect needs immediate attention, given the high interest in ATI
research today.
Table 1

Expected Mean Square Table for ANCOVA,
One Covariate, One Factor (Fixed or Random)

Source of Variation    Model I (Covariate)    Model II (No Covariate)

A                      [illegible]            [illegible]
X                      [illegible]            [illegible]
e                      [illegible]            [illegible]

Model I:   $y_{ij} = \mu + \beta(x_{ij} - \mu_x) + \alpha_i +$ [error term illegible]

Model II:  $y_{ij} = \mu + \alpha_i +$ [error term illegible]

$f_e$ = degrees of freedom for error

$\rho^2 = [\mathrm{Corr}(y, x)]^2$
Table 2

Journal Articles Using Regression Models for ANOVA
and ANCOVA with Design Specification Errors and Misuse

Columns: Article; Journal (1=AERJ, 2=JEP); Model Misspecification;
Slope test; Reliability. Marks are listed in source order; their
column positions are not recoverable.

Corno (1979)                         1   X  X
Melican & Feldt (1980)               1   X  +++
Evertson et al. (1980)               1   [illegible]
Alderman & Powers (1980)             1   X  X  X
Greene (1980)                        1   X  X  X
Peterson et al. (1980)               1   X  X
Martin, Veldman & Anderson (1980)    1   +  X
Corno et al. (1981)                  1   X  [illegible]  X
Janicki & Peterson (1981)            1   X  +++
Beady & Hansell (1981)               1   X  X
Evertson et al. (1981)               1   +++  X
Pascarella et al. (1981)             1   X  X
Peterson et al. (1981)               1   X  X  +
Sharp (1981)                         1   X  X
Article                      Journal

Slavin (1979)                2
Peterson (1979)              2
Clark et al. (1979)          2
Corno (1980)                 2
Slavin (1980)                2
Schunk (1981)                2
White et al. (1981)          2
Ross & Rakow (1981)          2
Stinard & Dolphin (1981)     2

Model Misspecification:      +  X  X  X  X
Slope test, Reliability:     +++  X  X  X  +++
[assignment of these marks to articles is illegible]

Legend:

X = problem
+++ = addressed or tested
blank = not a threat or not relevant
+ = unclear form of analysis